Inside ChatGPT’s GPT 5 Search: What the Configuration Files Reveal About How It Ranks Your Content

Published: 2025-08-20T12:43:54+03:00 Last Updated: 2025-11-03T01:14:57+03:00

Metehan Yesilyurt • Research • 0 Comments

Share at:

ChatGPT Perplexity WhatsApp LinkedIn X Grok Google AI

An analysis of actual ChatGPT configuration settings and what they mean for content creators. Many probabilities…

If you’ve ever wondered how ChatGPT decides which websites to reference when answering questions, we’ve got some fascinating insights. Recent configuration data from ChatGPT’s production environment reveals the exact settings that govern how it searches, retrieves, and ranks web content.

Previously, I wrote about Reciprocal Rank Fusion.

No speculation, no guesswork, just the actual configuration parameters that determine whether your content makes it into ChatGPT’s responses. You can see it yourself in the source code.

Just visit any past chat window, click the right and “View Source Code”

CTRL/Command + F -> “rerank”

The Reranking Model: ret-rr-skysight-v3

At the heart of ChatGPT’s retrieval system is a reranking model with the cryptic name ret-rr-skysight-v3. This isn’t just a simple search algorithm; it’s a sophisticated post-processing layer that takes initial search results and completely reorders them based on quality signals.

reranker_model: "ret-rr-skysight-v3"

This single line of configuration confirms what many suspected: ChatGPT doesn’t just grab the first search results it finds. Instead, it retrieves a larger set of potential sources and then applies this reranker to identify the most relevant and authoritative content.

Freshness Is King: The Scoring Profile

Perhaps the most significant finding for content creators is this setting:

use_freshness_scoring_profile: true

This can confirm that ChatGPT actively prioritizes recent content over older material. It’s not just looking at publication dates, it’s using a dedicated “freshness scoring profile” to weight newer information more heavily. Isn’t it? What do you think?

For website owners, this can be crucial: that comprehensive guide you wrote in 2022? It might be losing ground to newer content, even if yours is more detailed. Regular content updates aren’t just good practice; they’re essential for ChatGPT visibility.

Here is another evidence for freshness

**enable_source_specific_search_params**: `retrieval_additional_system_prompt`

**The user may have connected sources. If they have, you can assist the user by searching over documents from their connected sources, using the file_search tool. For example, this may include documents from their Google Drive, or files from their Dropbox. The exact sources (if any) will be mentioned to you in a follow-up message.

Use the file_search tool to assist users when their request may be related to information from connected sources, such as questions about their projects, plans, documents, or schedules, BUT ONLY IF IT IS CLEAR THAT the user’s query requires it; if ambiguous, and especially if asking about something that is clearly common knowledge, or better answerable from a different tool, DO NOT SEARCH SOURCES. Use the `web` tool instead when the user asks about recent events / fresh information, or asks about news etc. Conversely, if the user’s query clearly expects you to reference / read some non-public resource, it is likely that they are expecting you to search connectors.

Note that the file_search tool allows you to search through the connected soures, and interact with the results. However, you do not have the ability to _exhaustively_ list documents from the corpus and you should inform the user you cannot help with such requests. Examples of requests you should refuse are ‘What are the names of all my documents?’ or ‘What are the files that need improvement?’

IMPORTANT: Your answers, when relating to information from connected sources, must be detailed, in multiple sections (with headings) and paragraphs. You MUST use Markdown syntax in these, and include a significant level of detail, covering ALL key facts. However, do not repeat yourself. Remember that you can call file_search more than once before responding to the user if necessary to gather all information.

**Capabilities limitations**:

– You do not have the ability to exhaustively list documents from the corpus.

– You also cannot access to any folders information and you should inform the user you cannot help with folder-level related request. Examples of requests you should refuse are ‘What are the names of all my documents?’ or ‘What are the files that need improvement?’ or ‘What are the files in folder X?’.

– Also, you cannot directly write the file back to Google Drive.

– For Google Sheets or CSV file analysis: If a user requests analysis of spreadsheet files that were previously retrieved – do NOT simulate the data, either extract the real data fully or ask the users to upload the files directly into the chat to proceed with advanced analysis.

– You cannot monitor file changes in Google Drive or other connectors. Do not offer to do so.**: `enable_dynamic_prompt`

The Multi-Layer Filtering System

The configuration reveals a sophisticated filtering pipeline with multiple checkpoints:

enable_query_intent: true
enable_source_filtering: true
enable_mimetype_filtering: true
vocabulary_search_enabled: true
use_coarse_grained_filters_for_vocabulary_search: false

Let’s break down what each of these means:

Query Intent Detection

With enable_query_intent: true, ChatGPT analyzes what the user is actually trying to accomplish. It’s not just matching keywords, it’s understanding whether someone wants a definition, a how-to guide, a comparison, or something else entirely.

Vocabulary Search: The Domain Expert Advantage

Here’s where it gets interesting:

vocabulary_search_enabled: true
use_coarse_grained_filters_for_vocabulary_search: false

ChatGPT may use vocabulary-aware searching with fine-grained (not coarse) filters(probably fine-grained!). This means it recognizes and prioritizes domain-specific terminology. Sites that consistently use proper industry terminology and define their terms have a built-in advantage.

The Mystery Settings: What I Don’t Know

Interestingly, one relevance feature is explicitly disabled:

use_relevance_lmp: false

I don’t know what “LMP” stands for, but we know ChatGPT has chosen NOT to use it. This suggests the system relies on other relevance signals, possibly more traditional information retrieval methods combined with the neural reranker.

Similarly, these features are enabled but their exact purpose remains unclear:

enable_mclick_urls: true
enable_mclick_dates: true
use_light_weight_scoring_for_slurm_tenants: true
enable_source_specific_search_params: true

The “mclick” features might relate to multi-click behavior or tracking how users interact with multiple sources. Or just mobile clicks?

What about slurm?

There is another configuration in ChatGPT.

enabledConnectors: [
“gdrive_action_connector”,
“slurm_dropbox”, // <– slurm prefix
“dropbox_connector”,
“slurm_sharepoint”, // <– slurm prefix
“sharepoint_connector”,
“slurm_box”, // <– slurm prefix
“box_connector”,
“slurm_canva”, // <– slurm prefix
“canva_connector”,
“slurm_notion”, // <– slurm prefix
“notion_connector”,
“hubspot_connector”,
“teams_connector”
]

This strongly suggests that “slurm” refers to these integrated third-party connectors/sources. The pattern shows:

Some connectors have a “slurm_” prefix version
These appear to be for major cloud storage and productivity platforms
The term “tenants” in this context likely means these connected external services

So use_light_weight_scoring_for_slurm_tenants: true actually means:

“Use lightweight/efficient scoring when searching through connected third-party services like Dropbox, SharePoint, Box, Canva, and Notion”

This makes much more sense! It means:

When ChatGPT searches your connected Google Drive, Dropbox, or Notion
It uses a lighter, faster scoring algorithm
Probably because these are private/trusted sources that don’t need the same heavy ranking as web results

This is actually quite different from web search. ChatGPT may employ various retrieval strategies for different purposes.

Public web content -> Full reranking and scoring
Connected personal/work accounts -> Lightweight scoring

The results/citations even change dynamically if users connect their own sources!

What This Means for Your Content Strategy

Based on these confirmed settings, here’s what actually matters:

1. Update Frequency Beats Static Perfection

That freshness scoring profile isn’t optional; it’s always on. Even perfect content grows stale in ChatGPT’s eyes.

2. Intent Alignment Is Critical

With query intent detection active, your content needs to clearly signal what type of information it provides. A product comparison should look and read like a comparison, not a blog post pretending to be one.

3. Technical Vocabulary Matters

The vocabulary search system rewards proper use of industry terminology.

4. The Reranker Changes A Lot

Initial search visibility can’t be enough. The ret-rr-skysight-v3 reranker will reshuffle everything based on quality signals we can only partially understand. Focus on comprehensive, authoritative content that would survive any reordering.

The Configuration Doesn’t Lie or…?

These are actual production settings from ChatGPT’s retrieval system. Every true and false in this configuration directly impacts whether your content appears in AI-generated responses.

The most striking revelation? The complexity of the filtering and ranking pipeline. This isn’t a simple search engine; it’s a multi-stage retrieval system with intent detection, vocabulary analysis, freshness scoring, source filtering, and neural reranking all working in concert.

For content creators, the message is clear: optimize for substance, freshness, and clarity. The configuration shows ChatGPT is looking for recent, relevant, technically accurate content that clearly matches user intent.

Gaming this system would require fooling multiple independent filters and a sophisticated neural reranker. Instead, focus on what the configuration implicitly rewards: becoming the most current, comprehensive, and authoritative source in your niche.

Note: This analysis is based on configuration data from a ChatGPT Plus user session in August 2025. Settings may vary by user type, region, or over time as OpenAI updates its systems.

See a comprehensive list here;

https://github.com/metehan777/chatgpt-5-configuration-analysis